Learning Phrase Embeddings from Paraphrases with GRUs
Authors
Abstract
Learning phrase representations has been widely explored in many Natural Language Processing (NLP) tasks (e.g., Sentiment Analysis, Machine Translation) and has shown promising improvements. Previous studies either learn non-compositional phrase representations with general word-embedding learning techniques or learn compositional phrase representations based on syntactic structures; these approaches either require huge amounts of human annotation or cannot easily generalize to all phrases. In this work, we propose to take advantage of a large-scale paraphrase database and present a pairwise gated recurrent units (pairwise-GRU) framework to generate compositional phrase representations. Our framework can be re-used to generate representations for arbitrary phrases. Experimental results show that our framework achieves state-of-the-art results on several phrase similarity tasks.
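The abstract gives only a high-level description of the pairwise-GRU framework. As a rough illustration, the sketch below (PyTorch) shows one way a shared GRU encoder could produce phrase embeddings from paraphrase pairs and be trained with a margin-based similarity loss; the architecture details, loss function, hyperparameters, and names such as PhraseGRUEncoder are assumptions for illustration, not the authors' implementation.

# Minimal sketch, not the authors' code: a shared GRU encodes each phrase, and a
# hinge loss pulls paraphrase pairs together while pushing random negatives apart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PhraseGRUEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, max_phrase_len); the final hidden state is used as the phrase embedding
        _, h_n = self.gru(self.embed(token_ids))
        return h_n.squeeze(0)  # (batch, hidden_dim)

def pairwise_loss(encoder, phrase_a, phrase_b, phrase_neg, margin=0.4):
    # A paraphrase pair (a, b) should be more similar than (a, random negative) by at least `margin`.
    za, zb, zn = encoder(phrase_a), encoder(phrase_b), encoder(phrase_neg)
    pos = F.cosine_similarity(za, zb)
    neg = F.cosine_similarity(za, zn)
    return F.relu(margin - pos + neg).mean()

Because the encoder reads phrases token by token, any unseen phrase can be tokenized and passed through the trained model, which is what lets a compositional encoder of this kind be re-used for arbitrary phrases.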
Similar Resources
Learning Composition Models for Phrase Embeddings
Lexical embeddings can serve as useful representations for words for a variety of NLP tasks, but learning embeddings for phrases can be challenging. While separate embeddings are learned for each word, this is infeasible for every phrase. We construct phrase embeddings by learning how to compose word embeddings using features that capture phrase structure and context. We propose efficient unsup...
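To make the composition idea in the snippet above concrete, here is a toy feature-weighted additive composition in Python; the feature scheme and the function compose_phrase are invented for this example and are not the paper's model.

# Toy illustration only: the phrase vector is a per-word weighted sum, with each
# word's weight predicted from simple features (all names here are made up).
import numpy as np

def compose_phrase(word_vecs, word_features, feature_weights):
    # word_vecs: list of (d,) arrays; word_features: list of (k,) feature arrays;
    # feature_weights: (k,) learned parameters. Returns a (d,) phrase vector.
    scores = np.array([f @ feature_weights for f in word_features])
    alphas = np.exp(scores) / np.exp(scores).sum()  # softmax over the phrase's words
    return sum(a * v for a, v in zip(alphas, word_vecs))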
Fuzzy paraphrases in learning word representations with a lexicon
We identify a trap that is not carefully addressed in previous work using lexicons or ontologies to train or improve distributed word representations: for polysemous words and utterances whose meaning changes with context, their paraphrases or related entities in a lexicon or an ontology are unreliable and can degrade the learning of word representations. Thus, we propose a...
Adaptive Joint Learning of Compositional and Non-Compositional Phrase Embeddings
We present a novel method for jointly learning compositional and noncompositional phrase embeddings by adaptively weighting both types of embeddings using a compositionality scoring function. The scoring function is used to quantify the level of compositionality of each phrase, and the parameters of the function are jointly optimized with the objective for learning phrase embeddings. In experim...
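A hedged sketch of the adaptive-weighting idea described above follows: a sigmoid compositionality score mixes an additive compositional vector with a phrase-specific (non-compositional) vector. The scoring function, the additive composition, and the class name are assumptions for illustration rather than the paper's exact formulation.

# Illustrative only: alpha in [0, 1] scores how compositional a two-word phrase is
# and interpolates between a composed vector and a learned phrase-specific vector.
import torch
import torch.nn as nn

class AdaptivePhraseEmbedding(nn.Module):
    def __init__(self, num_phrases, dim=300):
        super().__init__()
        self.noncomp = nn.Embedding(num_phrases, dim)  # one vector per listed phrase
        self.scorer = nn.Linear(2 * dim, 1)            # compositionality score from the word vectors

    def forward(self, phrase_id, w1_vec, w2_vec):
        comp = w1_vec + w2_vec  # simple additive composition
        alpha = torch.sigmoid(self.scorer(torch.cat([w1_vec, w2_vec], dim=-1)))
        return alpha * comp + (1 - alpha) * self.noncomp(phrase_id)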
Hitting the Right Paraphrases in Good Time
We present a random-walk-based approach to learning paraphrases from bilingual parallel corpora. The corpora are represented as a graph in which a node corresponds to a phrase, and an edge exists between two nodes if their corresponding phrases are aligned in a phrase table. We sample random walks to compute the average number of steps it takes to reach a ranking of paraphrases with better ones...
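The phrase graph and random walks described above can be illustrated with a short toy sketch. The visit-count ranking below is a simplification of the paper's actual scoring, it ignores that in the bilingual setting only nodes reached after an even number of steps share the query's language, and all function names are made up.

# Toy sketch: phrases are nodes, phrase-table alignments are edges, and short
# random walks from a query phrase surface candidate paraphrases.
import random
from collections import Counter, defaultdict

def build_graph(phrase_table):
    # phrase_table: iterable of (source_phrase, target_phrase) alignment pairs
    graph = defaultdict(set)
    for src, tgt in phrase_table:
        graph[src].add(tgt)
        graph[tgt].add(src)
    return graph

def paraphrase_candidates(graph, start, num_walks=200, walk_len=4, seed=0):
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        node = start
        for _ in range(walk_len):
            neighbors = list(graph[node])
            if not neighbors:
                break
            node = rng.choice(neighbors)
            if node != start:
                visits[node] += 1
    return visits.most_common()  # phrases ranked by how often the walks reach them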
Using Word Embeddings for Improving Statistical Machine Translation of Phrasal Verbs
We examine the employment of word embeddings for machine translation (MT) of phrasal verbs (PVs), a linguistic phenomenon with challenging semantics. Using word embeddings, we augment the translation model with two features: one modelling distributional semantic properties of the source and target phrase and another modelling the degree of compositionality of PVs. We also obtain paraphrases to ...
Journal: CoRR
Volume: abs/1710.05094
Pages: -
Publication date: 2017